Abstract

Background: During adaptive radiation events, characters can arise multiple times due to parallel evolution, but transfer of 
traits through hybridization provides an alternative explanation for the same character appearing in apparently non-sister 
lineages. The signature of hybridization can be detected in incongruence between phylogenies derived from different 
markers, or from the presence of two divergent versions of a nuclear marker such as ITS within one individual. 

Methodology/Principal Findings: In this study, we cloned and sequenced ITS regions for 30 species of the genus Rheum, 
and compared them with a cpDNA phylogeny. Seven species contained two divergent copies of ITS that resolved in 
different clades from one another in each case, indicating hybridization events too recent for concerted evolution to have 
homogenised the ITS sequences. Hybridization was also indicated in at least two further species via incongruence in their 
position between ITS and cpDNA phylogenies. None of the ITS sequences present in these nine species matched those 
detected in any other species, which provides tentative evidence against recent introgression as an explanation. Rheum 
globulosum, previously indicated by cpDNA to represent an independent origin of decumbent habit, is indicated by ITS to 
be part of a clade of decumbent species, which acquired cpDNA of another clade via hybridization. However, decumbent and 
glasshouse morphology are confirmed to have arisen three and two times, respectively. 

Conclusions: These findings suggest that hybridization among QTP species of Rheum has been extensive, and that the role 
of hybridization in the diversification of Rheum requires investigation. 
Introduction 

Adaptive radiation events are a significant source of new species, 
ecological diversity and morphological innovation [1-4]. Although 
such events are well known on oceanic islands [3,5-6], they may 
also occur on continental landmasses if significant climatic and/or 
geological upheavals have created new ecological niches [4,7-8]. 
One event implicated in many such radiations is the uplift of the 
Qinghai-Tibetan Plateau (QTP) [9-14]. 

Hybridization has been suggested to have greatly contributed to 
the adaptive radiation of many genera that have large numbers of 
species within a restricted distributional range [5-6,15-17], and 
the genetic signature of hybridization is visible in many such 
genera [9,18-23]. Reproductive isolation is often incomplete 
within species groups derived by recent rapid radiation, permitting 
hybrid formation [24-25]. Moreover, hybrid speciation can confer 
tolerance of new habitats [26-27], and may hence contribute to 
radiation events where many new niches are available to colonise 
[28-32] or even trigger them [24]. However, determining the 
exact role of hybridization in adaptive radiation events remains 
challenging. 

Rheum L. (Polygonaceae) contains ~60 species, mainly distrib- 
uted in the QTP and adjacent regions [33-35]. This diversity 
appears to result from two radiation events, the first around 9.9- 
12.0 million years ago (Mya) and the second around 5 Mya 
[14,36]. There is extensive morphological and ecological variation 
between species [33-35], and certain adaptive traits, such as 
decumbent habit and "glasshouse" morphology involving translucent 
bracts, have evolved multiple times in parallel [14,37-39]. 
However, reticulate evolution due to hybridization might give the 
impression of a character evolving multiple times, when in fact it 
evolved only once. A possible role for hybridization in 
the diversification of Rheum has not yet been investigated. 
Furthermore, at least one polyploidization event has occurred, 
because some species are tetraploid [40-43]. 

The nuclear rDNA internal transcribed spacer (ITS) region is a 
universal species-specific marker for plants and fungi [44-47] 
and a popular marker for phylogenetic reconstructions [48-52] 
and for biogeographic and other evolutionary studies [53-60]. Typically 
several hundred copies exist within plant genomes, which means 
that the signature of autopolyploidization [52,61-64] or recent 
hybridization or introgression [63-66] can be detected, unless 
enough generations have passed for concerted evolution to 
homogenise all ITS copies within the genome [48,67]. The 
signature of past hybridization can also be detected by incongru- 
ence with a plastid phylogeny, which already exists for Rheum [14]. 

In this study, we extended our previous examination of the 
diversification history of the genus Rheum [10,14,37,39,68]. We 
cloned and sequenced ITS sequences for 30 Rheum species, 
representing all seven sections of this genus. We sought evidence of 
hybridization via (i) multiple ITS copies and (ii) incongruence 
between ITS and cpDNA, and used these data to evaluate the 
extent of hybridization during diversification in Rheum. 

Materials and Methods 

2.1 Plant materials 

We collected one accession each of 30 species, representing 
seven sections of Rheum (Table 1). Most of the species examined by 
Sun et al. [14] for cpDNA were included, except for four which 
could not be obtained: R. acuminatum, R. delavayi, R. palaestinum and 
R. tataricum. Most species samples were collected in the QTP, and 
voucher specimens were deposited in the herbaria of the Northwest 
Plateau Institute of Biology (HNWP), Chinese Academy of 
Sciences, and the School of Life Sciences, Lanzhou University, China 
(Table 1). We selected nine species from five other genera of the 
Polygonaceae, plus Limonium sinense from the Plumbaginaceae, to 
serve as outgroups to root our phylogenetic analyses. 

2.2 DNA extraction, amplification, cloning and 
sequencing 

Total DNA was extracted from silica-dried leaves using a 
modified CTAB method [69]. The primers used for amplification 
were ITS1 (5'-TCCGTAGGTGAACCTGCGG-3') and ITS4 
(5'-TCCTCCGCTTATTGATATGC-3') [70]. The amplification 
program consisted of an initial template denaturation step at 95 °C for 
5 min, followed by 38 cycles at 94 °C for 20 s, 50 °C for 30 s, and 72 °C for 
40 s, and a final extension at 72 °C for 5 min. We first sequenced the 
amplification products directly. However, most of these sequences 
could not be read unambiguously, especially for seven species, i.e. R. 
hotaoense, R. officinale, R. tanguticum, R. pumilum, R. likiangense, R. 
franzenbachii and R. reticulatum, mainly because the chromatograms 
contained many impurity peaks. Therefore, for each of the 30 Rheum species examined, the 
amplified fragments were ligated into a pUC18 vector (Takara 
Inc.) and transformed into Escherichia coli strain DH5α. At least 15 positive clones for each species were selected and 
sequenced on an ABI Prism automated sequencer with universal 
primers M13rev and M13uni. In order to reduce false base 
substitutions resulting from PCR polymerase mismatch, any 
polymorphism that was observed in only one clone was removed 
from the analyses. Sequences were edited and aligned with 
MegAlign and manually adjusted at two positions that had minor 
length polymorphism. All sequences have been submitted to 
GenBank (Table S1 in File S1). 
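
The single-clone filter described above can be illustrated with a short Python sketch. It is not the authors' procedure: the function name `filter_singleton_variants` is hypothetical, the clones are assumed to be already aligned to equal length, and a base seen in only one clone at a site is assumed to be replaced by the most common base at that site (the paper only states that such polymorphisms were removed).

```python
from collections import Counter


def filter_singleton_variants(clones):
    """Return aligned clone sequences with single-clone variants masked.

    Assumption (not from the paper): a base observed in exactly one clone
    at a column is replaced by the most common base in that column.
    """
    if len({len(c) for c in clones}) != 1:
        raise ValueError("clones must be aligned to equal length")
    cleaned = [list(c) for c in clones]
    for col in range(len(clones[0])):
        counts = Counter(c[col] for c in clones)
        majority = counts.most_common(1)[0][0]
        for row in cleaned:
            # A base supported by a single clone is treated as a likely PCR error.
            if counts[row[col]] == 1 and row[col] != majority:
                row[col] = majority
    return ["".join(row) for row in cleaned]


clones = ["ACGT", "ACGT", "ACGA", "TCGT", "TCGT"]
filtered = filter_singleton_variants(clones)
```

In this toy input, the A/T difference in the first column is seen in several clones and is kept, whereas the final-column A occurs in one clone only and is replaced.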



2.3 Data analyses 

The boundaries of the sequenced ITS regions (including ITS1, 5.8S, 
and ITS2) were identified by comparison with the Rumex crispus 
ITS sequence from GenBank (Accession number: AF338221), and 
ITS2 boundaries and alignments were confirmed by Hidden 
Markov Models, following Keller et al. [71]. Clustal W was used 
for the initial alignment of sequences [72], and BioEdit v 5.0.6 [73] was 
used to refine the alignments manually and to determine the 
lengths and GC contents of ITS1, 5.8S, and ITS2 separately. The 
repeats in the ITS sequences were detected with the Tandem 
Repeats Finder [74]. For each species, sequences for the eight 
regions of cpDNA examined by Sun et al. [14] were retrieved 
(Table S1 in File S1). Thus, two sequence matrices (ITS and 
cpDNA) were used in further analyses. All indels in the two 
matrices were coded as binary states (0 for absence, and 1 for 
presence) using the GAPCODER program [75]. 
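
As an illustration of binary indel coding, the following Python sketch scores each distinct gap (defined by its start and end column) as a presence/absence character. It is a simplification written for this text, not the GAPCODER program: the function names are hypothetical, 1 is assumed to mean that the gap is present in a sequence, and cases where a longer gap overlaps a shorter one are scored as 0 rather than as inapplicable.

```python
import re


def gap_runs(seq):
    """Return the set of (start, end) column spans of gap runs in one aligned sequence."""
    return {(m.start(), m.end()) for m in re.finditer(r"-+", seq)}


def code_indels(alignment):
    """Code every distinct gap span as a binary character (1 = gap present).

    `alignment` maps sequence names to aligned sequences of equal length.
    Returns the sorted list of gap spans and a name -> binary string matrix.
    """
    runs = {name: gap_runs(seq) for name, seq in alignment.items()}
    characters = sorted(set().union(*runs.values()))
    matrix = {
        name: "".join("1" if span in runs[name] else "0" for span in characters)
        for name in alignment
    }
    return characters, matrix


toy = {"a": "AC--GT", "b": "ACTTGT", "c": "A---GT"}
characters, matrix = code_indels(toy)
```

The resulting binary columns would be appended to the nucleotide matrix as additional characters.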

2.4. Phylogeny analyses 

Phylogenetic analyses were conducted based on the sequences of 
the entire ITS region, the ITS1 region and the ITS2 region, respectively. We 
used MrModeltest 2.0 [76] to choose the most appropriate model 
for each dataset for the ML and Bayesian analyses (the selected 
model was GTR + I + G for each analysis). Maximum likelihood 
analyses were performed using PHYML 3.0 with 1000 bootstrap replicates 
under the GTR + I + G model [77]; the partial parameterization commands 
used here were: -b 1000 -m GTR -v e -f e -t e -a e -o tlr (see 
the PHYML manual for details). MrBayes 3.1.2 was used to 
perform the Bayesian inference analysis to find the optimal tree 
topology [78]. Four runs were made, each of 10 million 
generations, saving every 1000th tree. The parameters of the 
selected model were optimized during searches as recommended, 
running two independent Markov chain Monte Carlo (MCMC) 
analyses, each with one cold and three heated chains, for each dataset 
with a 50% 'burn-in'. MCMC convergence was also explored by 
examining the Potential Scale Reduction Factor convergence 
diagnostics [79] for all parameters in the model. The posterior 
probabilities indicating support values for each branch were also 
estimated. 
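
For readers unfamiliar with the Potential Scale Reduction Factor, the sketch below computes the classic Gelman-Rubin form of the statistic for one parameter from post-burn-in samples of several independent runs. It is an illustration under stated assumptions: the function name `psrf` is hypothetical, and the exact variant reported by MrBayes may differ in detail.

```python
from math import sqrt
from statistics import mean, variance


def psrf(chains):
    """Gelman-Rubin potential scale reduction factor for one parameter.

    `chains` is a list of equally long lists of post-burn-in samples,
    one list per independent run. Values close to 1 suggest convergence.
    """
    m, n = len(chains), len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    within = mean([variance(c) for c in chains])          # W: mean within-chain variance
    between_over_n = variance([mean(c) for c in chains])  # B/n: variance of chain means
    pooled = (n - 1) / n * within + between_over_n        # pooled variance estimate
    return sqrt(pooled / within)


agreeing = psrf([[1, 2, 3, 4], [1, 2, 3, 4]])
disagreeing = psrf([[1, 2, 3, 4], [11, 12, 13, 14]])
```

Runs that sample the same distribution give a value near 1, whereas runs stuck in different regions give a value well above 1.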

2.5. Genetic distance between species and between ITS 
versions 

Based on the effects of incomplete concerted evolution of ITS 
sequences, hybridization can be inferred in the ancestry of any 
individual that contains copies of ITS that are less similar to one 
another than each is to a sequence present in another species [80]. 
Seven species contained multiple copies of ITS, i.e. R. hotaoense, R. 
officinale, R. tanguticum, R. pumilum, R. likiangense, R. franzenbachii and 
R. reticulatum. We used MEGA5 [81] to compute the p-distance 
between the two ITS versions within each of these species, and 
that between each version and the most similar sequence detected 
across all other species. In each comparison, the minimum difference 
between sequences was used. 
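
The criterion above can be sketched in Python as follows. This is a minimal illustration, not MEGA5: the function names are hypothetical, the p-distance is taken as the proportion of differing sites, and columns with a gap in either sequence are assumed to be skipped.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def suggests_hybrid_origin(version1, version2, other_species):
    """True if the two versions are less similar to each other than each
    is to its most similar sequence from another species."""
    within = p_distance(version1, version2)
    nearest1 = min(p_distance(version1, o) for o in other_species)
    nearest2 = min(p_distance(version2, o) for o in other_species)
    return within > nearest1 and within > nearest2


flag = suggests_hybrid_origin("AAAAAAAA", "TTTTAAAA", ["AAAAAAAT", "TTTTAAAT"])
```

In the toy call, the two versions differ at half of their sites while each lies within one substitution of a sequence from another species, so the criterion is met.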

2.6. Checking for pseudogenes 

Where multiple ITS copies were detected, we checked each 
copy for two characters that might indicate it to be a pseudogene. 
The first was the presence of three 5.8S motifs, which are 
necessary to give the ITS2 proximal stem connecting 5.8S to 28S 
its correct functional structure, and whose absence hence indicates 
pseudogenes [82-84]. These comprise one 14-bp motif 
common to all seed plants [82], and two that are 
common to all angiosperms [83-84]. The second was lower GC 
content, which also characterises pseudogenes. 
Table 1. List of taxa and sources of plant materials analyzed and accessions of ITS sequences in GenBank.

| Taxon | Sources/voucher | Chromosome number: 2n(n)/X | GenBank accession, ITS version 1 | GenBank accession, ITS version 2 |
|---|---|---|---|---|
| Rheum L. | | | | |
| Sect. I Rheum | | | | |
| R. webbianum Royle | Cuori, Xizang/Sn31 | 22/44 | KF258680 | |
| R. hotaoense C. Y. Cheng et Kao | Ledu, Qinghai/Y99130-1 | ? | KF258681 | KF258682 |
| R. australe D. Don | Xizang, Deqing/Liu 1101 | 22 | KF258683 | |
| R. franzenbachii Münt. | Baotou, Neimeng/Liu hb | 22 | KF258684 | KF258685 |
| R. wittrockii Lundstr. | Yili, Xinjiang/Y 99059 | 44 | KF258686 | |
| R. forrestii Diels | Dali, Yunnan/Liu 2175 | ? | KF258687 | |
| R. likiangense (L.) Sam. | Yushu, Qinghai/Q99147 | 22 | KF258688 | KF258689 |
| R. lhasaense A. J. Li et P. K. Hsiao | Sangri, Xizang/Liu 1133 | ? | KF258690 | |
| R. compactum L. | Hami, Xinjiang/X99006 | 44 | KF258691 | |
| R. rhaponticum L. | Geneva, Switzerland/GG001 | 44 | KF258692 | |
| R. altaicum A. Los. | Aertai, Xinjiang/Liu xj | 44 | KF258693 | |
| R. macrophyllum J. Q. Liu | Rikaze, Xizang/Liu6265 | 22/44 | KF258694 | |
| Sect. II Palmata A. Los. | | | | |
| R. officinale Baill. | Nanchuan, Chongqing/991013 | 44 | KF258695 | KF258696 |
| R. palmatum L. | Kangding, Sichuan/Liu 2082 | 22 | KF258697 | |
| R. tanguticum Maxim. | Gande, Qinghai/Liu 1773 | 22 | KF258698 | KF258699 |
| Sect. III Acuminata C. Y. Cheng et Kao | | | | |
| R. kialense Franch. | Kangding, Sichuan/Liu 2050 | ? | KF258700 | |
| Sect. IV Deserticola Maxim. | | | | |
| R. sublanceolatum Cheng et Kao | Chenduo, Qinghai/Liu 847 | ? | KF258701 | |
| R. pumilum Maxim. | Chenduo, Qinghai/Y 99145 | 44 | KF258702 | KF258703 |
| R. nanum Siev. ex Pall. | Balikun, Xinjiang/Y 99129-1 | 22 | KF258704 | |
| R. tibeticum Maxim. ex Hook. f. | Qushui, Xizang/Liu 1112 | ? | KF258705 | |
| Sect. VI Spiciformia A. Los. | | | | |
| R. spiciforme Royle | Yeduo, Qinghai/Liu 689 | 22 | KF258706 | |
| R. moorcroftianum Royle | Yeduo, Qinghai/Liu 688 | ? | KF258707 | |
| R. przewalskyi A. Los. | Huzhu, Qinghai/Q99136 | ? | KF258708 | |
| R. rhizostachyum Schrenk | Sunan, Gansu/Liu 1506 | ? | KF258709 | |
| R. reticulatum A. Los. | Maduo, Qinghai/Liu 820 | 22 | KF258710 | KF258711 |
| R. rhomboideum A. Los. | Yeduo, Qinghai/Liu ly | 22 | KF258712 | |
| R. alpinum J. Q. Liu | Kangma, Xizang/Liu 6216 | ? | KF258713 | |
| Sect. VII Globulosa C. Y. Cheng et Kao | | | | |
| R. globulosum Gage | Dazi, Xizang/SN221 | ? | KF258714 | |
| Sect. VIII Nobilia A. Los. | | | | |
| R. nobile Hook. f. et Thoms. | Linzhi, Qinghai/Liu 1206 | 22 | KF258715 | |
| R. alexandrae Batal. | Kangding, Sichuan/Liu 2051 | 22 | KF258716 | |
| OUTGROUPS | | | | |
| Polygonum viviparum L. | | | GQ339919* | |
| Polygonum hookeri Meisn. | | | JN187112.1* | |
| Oxyria digyna (L.) Hill | Wuding, Sichuan/Liu 2087 | | FJ154474* | |
| Oxyria sinensis Hemsl. | | | KF258717 | |
| Rumex crispus L. | | | AF338221* | |
| Calligonum rubicundum Bge. | | | JN187107.1* | |
| Calligonum arborescens Litv. | | | JN187105.1* | |
| Atraphaxis spinosa L. | | | JN187102.1* | |
| Atraphaxis pungens (Bieb.) Jaub. et Spach | | | JN187100.1* | |
| Limonium sinense L. | | | EU410356* | |

*Indicates sequences that were retrieved from the GenBank database (?, not confirmed or not reported). Samples were collected in China except R. rhaponticum L. 
Chromosome numbers reported by Jaretzky (1928), Chin and Youngken (1947), Hu et al. (2007) and Liu et al. (2010). 
doi:10.1371/journal.pone.0089769.t001 



Results 

3.1 ITS sequence variation 

The entire ITS region, including ITS1, the 5.8S rDNA and 
ITS2, was amplified from at least 15 positive clones per Rheum 
species. Additive polymorphisms were positively identified only 
when each version of ITS was present in at least two clones per 
species. Seven accessions contained two highly divergent ITS 
versions, which differed from one another by at least 27 bases as 
follows: R. hotaoense (33 bases), R. officinale (27 bases), R. tanguticum 
(31 bases), R. pumilum (71 bases), R. likiangense (30 bases), R. 
franzenbachii (27 bases) and R. reticulatum (30 bases) (Figure 1). For 
each of those species, one or two additional accessions were 
examined, and every accession was found to contain both versions, 
although at different frequencies (Table S2 in File S1). Within each species, 
sequences of the same version from the 2-3 individuals differed 
at 2-4 bases, and no identical ITS sequences were found; hence there 
was far greater similarity between individuals than between ITS 
versions. Furthermore, in each of these species, the two ITS 
versions were not sister to one another in phylogenetic analysis; for 
each species, each version formed a discrete clade not closely 
related to the other (see below). Single accessions of three other 
species, i.e. R. nobile, R. nanum and R. alexandrae, each contained two 
barely divergent versions of ITS that differed at between 1 and 3 
sites in each case, and which were strongly supported as reciprocal 
sister sequences in phylogenetic analysis. In the remaining 
accessions, only one version was detected. The possibility that 
any of these ITS sequences could reflect fungal contamination was 
eliminated following the methods of Li et al. [44]. 

The ITS sequences of Rheum species ranged from 518 bp (R. 
pumilum 2) to 590 bp (R. nobile). The ITS1 region ranged from 
151 bp (R. pumilum 2) to 220 bp (R. nobile) (Figure 1). The 5.8S 
region had a length of 164 bp in all sequences. The ITS2 region 
ranged from 202 bp (R. franzenbachii 2) to 209 bp (R. nobile), 
annotated according to Keller et al. [71] (Table S3 in File S1). In 
the seven species, the ITS2 regions of the two versions, compared across the 2-3 
individuals of the same species, differed at 7-16 bases and occurred 
at different frequencies. Three putative repeats were detected 
using the 'Tandem Repeats Finder' with the default search options 
(alignment parameters 2, 7, 7, and minimum alignment score 50) 
(Table 2). These putative repeats were 17-46 bp in length, with 
2.1-3.0 copies, and 62-100 percent matches. Among these, two 
repeats were located in the ITS1 region, and one in the ITS2 region 
(Table 2). The analysis of gene conversion was performed using 
'Gene-conversion Software' on the basis of the ITS data of all 30 
Rheum species, and no gene conversion was detected. 

Figure 1. Schematic illustration of the distribution of substitution sites across the entire ITS region obtained from seven species of 
Rheum, using the R. globulosum ITS region as reference (red = T, purple = G, green = A, blue = C, yellow = gap). 
doi:10.1371/journal.pone.0089769.g001 

Table 2. Tandem repeats found in ITS sequences of Rheum. 

| Sequence label | Consensus pattern | Consensus size | Copy number | Percent matches | Location |
|---|---|---|---|---|---|
| R. rhomboideum | GACAGACCCGCGAACCCGTCTCTAACCCGCCGTCGGGGCGAGGGGG | 46 | 2.0 | 100 | ITS1 |
| R. macrophyllum | AGCGGAGGAAAAGAAAC | 17 | 2.1 | 100 | ITS2 |
| R. pumilum | CCCCTCCTGGCGCGCGCCAACCAAA | 25 | 3.2 | 62 | ITS1 |

doi:10.1371/journal.pone.0089769.t002 

To check if any ITS copies might be pseudogenes, for each 
species with multiple copies, the GC content of each copy was 
calculated and compared with the Rumex crispus ITS region 
(including ITS1, 5.8S, and ITS2), which was designated as the 
presumed functional paralog. The ITS sequences of all Rheum 
species had GC contents in the spacers similar to that of the Rumex crispus 
ITS (Accession number: AF338221; 67.17% in ITS) (Table S3 in 
File S1). 

Furthermore, all three of the conserved motifs of 5.8S involved 
in ITS2 proximal stem formation, namely motif 2 (5'-GAATTGCAGAATCC-3'), 
present in all seed plants [82], and motif 1 (5'-CGATGAAGAACGTAGC-3') 
and motif 3 (5'-TTTGAACGCA-3'), both present in all angiosperms [83-84], 
were present and unaltered in all investigated sequences (Table S3 in 
File S1), indicating no loss of function regarding secondary 
structure formation, and hence that none of the ITS sequences 
obtained were pseudogenes. 
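
The two pseudogene checks described above (conserved 5.8S motifs and GC content) can be sketched in a few lines of Python. The motif sequences are those quoted in the text; the function names and the toy sequence are hypothetical, and exact string matching is assumed, so any substitution within a motif counts as its loss.

```python
# Conserved 5.8S motifs as quoted in the text (5' to 3').
MOTIFS = {
    "motif 1": "CGATGAAGAACGTAGC",
    "motif 2": "GAATTGCAGAATCC",
    "motif 3": "TTTGAACGCA",
}


def gc_content(seq):
    """Percentage of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def missing_motifs(seq):
    """Names of conserved 5.8S motifs not found (exact match) in the sequence."""
    ungapped = seq.upper().replace("-", "")
    return [name for name, motif in MOTIFS.items() if motif not in ungapped]


# Toy sequence for illustration only, not a real ITS sequence.
toy = "GGCC" + MOTIFS["motif 1"] + "CC" + MOTIFS["motif 2"] + "GG" + MOTIFS["motif 3"]
intact = missing_motifs(toy)
damaged = missing_motifs(toy.replace("GAATTGCAGAATCC", "GAATTGCAGTATCC"))
```

A copy would be flagged as a candidate pseudogene if `missing_motifs` returned any motif name or if its GC content were clearly lower than that of the reference sequence.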

3.2 Phylogenetic analyses 

When analysed separately, ITS1 and ITS2 each resolved a 
monophyletic Rheum (Figure S1 in File S1), as did a dataset with 
indels excluded (not shown). Support values were low for most 
nodes in each analysis, and most differences between the trees 
therefore might reflect imperfect resolution. Despite this, the very 
different position of each R. franzenbachii version between the two 
trees is noteworthy, and for this species recombination affecting 
one ITS version should not be ruled out. 

An initial phylogenetic analysis on the combined dataset was 
conducted using all versions and accessions of ITS for all Rheum 
species (Figure S2 in File S1). However, for those that contained 
two barely divergent ITS versions, removing one of the two copies 
had no effect on support values or topology. Likewise, including 
only one accession per species, for those from which multiple 
accessions were sampled, had no effect on support values or 
topology. Therefore, the analysis was re-run using one randomly 
selected accession per species, and one randomly selected version 
from those with barely divergent ITS versions, but both versions of 
those with highly divergent ITS versions. 

The Maximum likelihood (ML) trees based on nrDNA ITS and 
cpDNA are shown in Figure 2, including bootstrap and Bayesian 
support values. All species of Rheum formed a monophyletic 
clade sister to the two-species genus Oxyria with high support 
values of 100% (Figure 2). As with cpDNA data [14], four major 
tentative clades (A, B, C, D) were recovered, although the support 
values for some of them remain low (Figure 2). Within these clades, 
phylogenetic structure was resolved within B and C, although only 
a few nodes in each had strong support (Figure 2). 

The distinct versions from each of the seven species (R. hotaoense, 
R. officinale, R. tanguticum, R. pumilum, R. likiangense, R. franzenbachii 
and R. reticulatum) did not cluster together, but nested into different 
clades with those from other species. Three of them (R. hotaoense, R. 
officinale, R. tanguticum) had ITS version 1 and cpDNA of Clade A, 
but ITS version 2 within Clade C; R. franzenbachii was similar 
except that ITS version 2 was in Clade B (Figure 2). R. pumilum 
also had ITS version 1 in Clade A, but had both ITS version 2 and 
cpDNA in Clade B. R. reticulatum had ITS version 1 in Clade B, but 
version 2 and cpDNA in Clade C. Most strikingly, R. likiangense 
was in Clade A for cpDNA, but its two ITS versions were in 
Clades B and C, respectively (Figure 2). 

In addition to these, two species displayed incongruence 
regarding their clade membership within the two phylogenies. R. 
lhasaense was in Clade B for ITS but Clade A for cpDNA, whereas 
R. globulosum was in Clade C for ITS but Clade B for cpDNA. 
There was also incongruence within Clade C, with for example R. 
moorcroftianum strongly supported as sister to R. spiciforme for ITS, 
but to R. reticulatum for cpDNA. Incongruence also occurs within 
Clade B, but is difficult to interpret because clade composition is 
very different between the two trees (Figure 2). 

3.3. Pairwise distances 

For each of the seven species containing highly divergent 
versions of ITS, the p-distance between versions was greater than 
that between each version and the most similar ITS sequence 
found in another species (Table 3), indicating likely acquisition of a 
second version via hybridization [80]. Indeed, within-species p- 
distance was between 19 and 35 for these seven species, whereas 
the highest between-species p-distance detected was 16 (Table 3). 

Discussion 

ITS data revealed seven instances of Rheum species containing 
two divergent ITS versions, which resolved in different phylogenetic 
positions in each case (Figure 2), indicating hybridization. 
Furthermore, incongruence between ITS and a cpDNA phylogeny 
[14] (Figure 2), regarding the position of R. lhasaense and R. 
globulosum, revealed two further instances of hybridization, making 
nine in total. Other cases of minor incongruence within clades 
could indicate further hybridization events, but other explanations 
such as lineage sorting effects are possible for these. In each of R. 
nobile, R. nanum and R. alexandrae, single accessions were found to 
contain two very similar but not identical versions of ITS; these 
might be due to ITS variation within species [85] (e.g. divergence 
between populations followed by gene flow) but do not provide 
additional reliable evidence for interspecific hybridization. 

4.1 Multiple ITS versions in seven Rheum species from 
the QTP 

The presence of two ITS versions in seven Rheum species (R. 
hotaoense, R. officinale, R. tanguticum, R. pumilum, R. likiangense, R. 
franzenbachii and R. reticulatum) indicates that in each case a second 
version has been acquired, and that concerted evolution [86-88] 
has not yet had time to reduce the number of versions back to one. 

Figure 2. The phylogenetic trees reconstructed using the maximum likelihood method on the basis of the nrDNA ITS matrix (left, log-likelihood = -6967.11) and the cpDNA matrix (right, log-likelihood = -30277.42). Bootstrap support values from ML analyses using PHYML 
are given below branches and the corresponding Bayesian posterior probabilities from Bayesian analyses using MrBayes are shown above branches. 
For simplification, the monophyletic clades A1, A, B, C and D, and a paraphyletic group A2, were marked on the cpDNA tree. On the ITS tree, the four 
clades and the ploidy of each Rheum species were marked, and the seven species with multiple clones were also marked with clone serial numbers. 
Branches of different colours mark species with different characters, and the branches of glasshouse species are marked with a triangle. 

This situation has been frequently reported in angiosperms and is 
usually attributed to hybridization [52,61-64]. An alternative 
hypothesis, that the extra versions are ITS pseudogenes (see [88]), 
can be rejected because pseudogenes normally form a distinct 
clade, having a single common ancestor sequence at the time of 
duplication, typically before species diversification [89-93]; this 
was not the case for Rheum. Furthermore, none of the ITS 
sequences for the species concerned had lost the conserved 
seed-plant-specific 14-bp motif [82] or the other two motifs in 5.8S [83-84], 
or had a lower GC content (Table S1 in File S1), both of 
which are expected for pseudogenes. Additionally, we sequenced 
ITS from 2-3 individuals of each of the seven species to 
confirm that the multiple copies did not result from sequencing 
error, and the two versions per species clustered into two 
clades with distinct phylogenetic positions (Figure S2 in File S2). 

Table 3. p-distances between ITS within and between species, involving species with two divergent versions of ITS. 

The number of generations necessary for concerted evolution to 
reach completion is thousands to millions [87,90,94-97]. The 
generation time in Rheum can be several decades, due to long-lived 
roots and rhizomes [14,33]. Therefore, the hybridization events 
revealed by these multiple ITS versions could be as much as 
several million years old, and if clonal reproduction makes genets 
live even longer, they could be older still [52,61-64]. However, 
recent hybridization is also possible in each case (see below). 

All of the seven species with two ITS versions were from the 
QTP. Among these, two were tetraploid (R. officinale and R. 
pumilum), four were diploid, and the chromosome number of R. 
hotaoense is unknown [43]. The tetraploids might be allopolyploids, 
but could also have received the second ITS copy via introgression 
following autopolyploidization [28]. Likewise, the diploids could be 
homoploid hybrid species or have acquired a second ITS copy via 
introgression. Hence in each case three hypotheses are possible: 
hybrid speciation (allopolyploid or homoploid depending on the 
species), ancient introgression, and recent introgression. However, 
despite this clear evidence for hybridization, the source species for 
the second ITS versions were not clear. A population genetic 
strategy based on more samples from throughout each species' range, 
combined with next-generation sequencing, should help to resolve the hybrid 
history of the seven species more clearly. 

4.2. Other evidence of hybridization in Rheum 

In addition to these, we found that R. globulosum and R. lhasaense 
had different phylogenetic positions on the ITS tree and the cpDNA 
tree (Figure 2). The differing relative positions within Clade B of R. 
forrestii and R. sublanceolatum between ITS and cpDNA also suggest 
possible hybridization for these species, although the clade 
composition is so different between the trees that it is difficult to 
interpret (Figure 2). Small differences in topology within 
Clade C are also evident, e.g. R. moorcroftianum has a different sister 
species in each tree. This could reflect lineage sorting in a young 
clade, or recent hybridization. In particular, the possibility of 
shared haplotypes between species [98-101] needs to be examined 
in light of this result. 

It is noteworthy that neither of the two ITS copies for R. 
likiangense (in Clades B and C) has the same phylogenetic position as 
its cpDNA, which is basal to Clade A. This could reflect more than 
one hybridization event in its history, and strongly indicates that at 
least one such event was not recent. 

It is likely that not all hybridization events have been detected 
by this analysis. While concerted evolution towards the paternal 
ITS type would produce incongruence such as seen here in R. 
globulosum [102], concerted evolution towards the maternal type 
would remove the signature of hybridization, producing congruent 
phylogenetic positions [48]. 

4.3 Ancient or recent hybridization in Rheum? 

Ancient introgression is very difficult to distinguish from hybrid 
speciation, unless large numbers of independent markers are 
available [103]. This study has uncovered such extensive evidence 
of hybridization in Rheum that a much larger and more 
comprehensive data set will be required to unravel it. Of 24 
species from the QTP, 9 show unequivocal evidence of 
hybridization, and there is tentative evidence in many more. 
There are no cases where pairs or groups of hybridized species are 
monophyletic for both cpDNA and ITS; therefore there is no 
evidence in our data for a hybrid speciation event followed by 
subsequent speciation. However, R. officinale and R. tanguticum are 
closely related for both markers, so the possibility cannot entirely 
be ruled out. 

Recent introgression by a single event would lead to all 
individuals having identical sequences for the captured ITS 
version, which is not what was observed in any of the species 
with two divergent ITS versions. Hence if the second versions were 
acquired by recent introgression then there were multiple, 
independent introgression events in each affected species. 
Furthermore, if all instances of multiple ITS versions resulted 
from recent introgression, then in each case one ITS version would 
match that for another species, the ITS donor. However, no two 
ITS sequences detected in this study were identical, and the most 
similar between two species were those of R. tanguticum and R. palmatum, 
which differed at 4 sites. This lack of a match might be due to the limited sample 
size, whereas mutation of the ITS sequence after introgression is unlikely to be 
the most parsimonious explanation. Although not all members of Rheum 
were examined, those species not examined tended to be narrowly 
distributed species, and hence unlikely to be recent ITS donors for 
the accessions with duplicated ITS. Although not unequivocal, this 
evidence makes it likely that at least some cases of duplicated ITS 
reflect ancient hybridization events. In particular, the incongruence 
between both ITS copies and cpDNA for R. likiangense cannot 
easily be explained by recent hybridization. 

Morphology also provides clues to past events. Within ITS Clade 
C, all species are decumbent except for four whose second ITS 
copy falls within this clade. Three of these species (R. officinale, R. 
hotaoense and R. tanguticum) have both cpDNA and their other ITS 
copy from Clade A, and also the non-decumbent habit of this 
clade. From this, introgression of ITS from a Clade C lineage 
is a likely hypothesis for their origin. A similar but more 
complex origin involving a third lineage might apply to R. 
likiangense. Conversely, R. reticulatum has ITS from both Clades B 
and C, but contains both cpDNA and the decumbent habit of 
Clade C, indicating possible introgression of ITS from Clade B. 
Hence ITS is also consistent with a decumbent habit being 
ancestral in Clade C, and cpDNA remains congruent with 
decumbent habit in all five cases above. However, R. globulosum has 
the ITS and decumbent habit of Clade C but the cpDNA of Clade 
B, indicating chloroplast capture as a possible past event for this 
species. Hence the presence of this species in cpDNA Clade B does 
not indicate an independent origin for the decumbent habit in this 
clade, as previously thought [14]; instead it is more consistent with 
a reticulation event. Nonetheless, three separate origins of 
decumbent habit (Clade C, R. alpinum and R. tibeticum) are 
confirmed by the ITS data, and the two separate origins for 
glasshouse morphology (R. nobile and R. alexandrae) [14] are also 
supported. 

Minor incongruence within clades, notably Clade C, might 
reflect more recent and ongoing hybridization events, such as the 
sharing of plastids between species, which is common among 
rapidly radiated groups [99-100] and other species-rich genera 
[101-102]. Therefore, extensive sampling of different individuals 
and populations of Rheum around the QTP, for both cpDNA and 
ITS, will be necessary to tease apart the effects of recent and 
ancient hybridization events. For now we can conclude that 
hybridization among QTP species of Rheum has been extensive, 
and recent enough for concerted evolution not to have acted in 
many cases. 
Supporting Information 

File S1. Figure S1, The phylogenetic trees reconstructed using the 
maximum likelihood method on the basis of the nrDNA ITS1 (left) and 
ITS2 (right) matrices, respectively. Bootstrap support values from 
ML analyses using PHYML are given below branches and the 
corresponding Bayesian posterior probabilities from Bayesian 
analyses using MrBayes are shown above branches. Figure S2, 
The phylogenetic trees reconstructed using the maximum likelihood 
method on the basis of the nrDNA ITS matrix including extra 
sequences from more individuals. Bootstrap support values from 
ML analyses using PHYML are given below branches and the 
corresponding Bayesian posterior probabilities from Bayesian 
analyses using MrBayes are shown above branches. The letters 
(X, Y, Z) after the species names represent different individuals, and 
the numbers indicate clone order. Table S1, Plant materials and list 
of accession numbers for the taxa used in the present study. The 
intron of trnK includes the matK gene and non-coding segments; 
rbcL-accD and trnL-F are intergenic spacers. Table S2, The 
frequency of two versions from all positive clones per Rheum 